Adaptively-Branching Fuzzy Greedy K-mean Decision Forest (FGK-DF) Model for Protein Local Tertiary Structure Prediction

Authors

  • Bernard Chen
  • Dongsheng Che
  • Cody Hudson
  • Aaron Crawford
  • Minwoo Kim

Abstract

For the past twenty years, protein tertiary structure research has received much attention, with solutions existing for both wet lab procedures (X-ray crystallography and NMR spectroscopy) and bioinformatics approaches (threading, homology modeling, and de novo). Unfortunately, each approach has significant shortcomings: wet lab procedures demand considerable time, capital, and expertise, while bioinformatics approaches are restricted by their methods, which limit the resolution or novelty of the tertiary structures they produce. This work proposes the Adaptively-Branching Fuzzy Greedy K-means-Decision Forest (FGK-DF) model, which uses conserved sequential and structural motifs that transcend protein family boundaries to predict the local tertiary structure of proteins with unknown structures. The FGK-DF model is conceptually compared against existing approaches and explicitly compared against the Super Granule Support Vector Machine (Super GSVM) approach, with accuracy and coverage results highlighted.

Similar Resources

Protein Sequence Motif Information Generated by Fuzzy - Hybrid Hierarchical K-means Clustering Algorithm

Recurring amino acid sequence patterns are referred to as protein sequence motifs. These recurring patterns are important because the conserved regions have the potential to reveal the role of the protein itself. In this paper, we modify the FGK model and apply the Hybrid Hierarchical K-means (HHK) clustering algorithm, which is a hybrid combination of Agglomerative Hierarchical Clustering an...

In Silico Prediction and Docking of Tertiary Structure of Multifunctional Protein X of Hepatitis B Virus

Hepatitis B virus (HBV) infection is a universal health problem and may result in acute, fulminant, or chronic hepatitis, liver cirrhosis, or hepatocellular carcinoma. The sequence for protein X of HBV was retrieved from the UniProt database. ProtParam from the ExPASy server was used to investigate the physicochemical properties of the protein. Homology modeling was carried out using the Phyre2 server, and refin...

FGK Model: An Efficient Granular Computing Model for Protein Sequence Motifs Information Discovery

Discovering protein sequence motif information is one of the most crucial tasks in bioinformatics research. In this paper, we try to obtain recurring protein patterns that are universally conserved across protein family boundaries. Achieving this goal requires an extremely large dataset, and therefore an efficient technique. In this article, short recurring segments of proteins a...

In silico Prediction and Docking of Tertiary Structure of LuxI, an Inducer Synthase of Vibrio fischeri

Background: LuxI is a component of the quorum sensing signaling pathway in Vibrio fischeri and is responsible for the inducer synthesis that is essential for bioluminescence. Methods: Homology modeling of LuxI was carried out using Phyre2 and refined with the GalaxyWEB server. Five models were generated and evaluated by ERRAT, ANOLEA, QMEAN6, and PROCHECK. Results: Five refined models were gener...

Inductive data mining: automatic generation of decision trees from data for QSAR modelling and process historical data analysis

A new inductive data mining method for automatic generation of decision trees from data (GPTree) is presented. Other decision tree induction techniques are based on recursive partitioning, employing greedy searches to choose the best splitting attribute and value at each node, and therefore necessarily miss regions of the search space; GPTree can overcome this problem. In add...

Publication date: 2014